L1 & L2 Regularization Effects on Model Weights
The following histograms display the weights of the 3 layers in the CNN. We can see the effect of L1 and L2 regularization by observing how the weights change over time.
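To make the setup concrete, here is a minimal sketch of how per-layer weight histograms like these can be computed. The layer names and shapes are hypothetical stand-ins for the report's CNN, and random values stand in for trained weights:

```python
import numpy as np

# Hypothetical weight arrays standing in for the CNN's 3 layers
# (random values used in place of actual trained weights).
rng = np.random.default_rng(0)
layers = {
    "conv1": rng.normal(0.0, 0.6, size=(32, 1, 3, 3)),
    "conv2": rng.normal(0.0, 0.5, size=(64, 32, 3, 3)),
    "fc":    rng.normal(0.0, 0.4, size=(10, 576)),
}

# Flatten each layer's weights and bin them, as the report's histograms do.
for name, w in layers.items():
    counts, edges = np.histogram(w.ravel(), bins=64)
    print(name, float(w.min()), float(w.max()), counts.sum())
```

Tracking these histograms over training steps is what makes the regularization effects below visible.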
Created on January 18 | Last edited on March 19
Baseline
As expected, the baseline model with no regularization contains a wide range of weight values, roughly [-2.5, 1].
Run: Baseline
L1 Regularization
We can immediately see how L1 regularization eliminates features in the model. The histograms show a much narrower density distribution around 0 than the baseline. The feature selection is especially visible in how narrow the peak of the weight distribution becomes. The range of the values remains roughly [-2, 1].
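The sparsity described above can be illustrated with L1's proximal (soft-thresholding) update, which is what drives small weights to exactly zero. This is a minimal numpy sketch with a random vector standing in for one layer's weights; the threshold `lam` is an assumed value:

```python
import numpy as np

# Hypothetical weight vector standing in for one CNN layer's weights.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.5, size=1000)

# One soft-thresholding step, the update L1 regularization induces:
# weights with |w| below lam are set exactly to zero.
lam = 0.1
w_l1 = np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

print(np.mean(w == 0.0))     # baseline: essentially no exact zeros
print(np.mean(w_l1 == 0.0))  # after L1 step: a noticeable fraction is zero
```

Repeated over many training steps, this is why the L1 histograms develop a tall, narrow spike at 0.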
Run: L1
L2 Regularization
The heavy weight shrinkage of L2 regularization is clearly visible when comparing these histograms with the ones above. The range of the weights shrinks from roughly [-2, 1] down to [-1.5, 0.6].
Run: L2